process extraction
PET: An Annotated Dataset for Process Extraction from Natural Language Text
Bellan, Patrizio, van der Aa, Han, Dragoni, Mauro, Ghidini, Chiara, Ponzetto, Simone Paolo
Process extraction from text is an important task of process discovery, for which various approaches have been developed in recent years. However, in contrast to other information extraction tasks, there is a lack of gold-standard corpora of business process descriptions that are carefully annotated with all the entities and relationships of interest. As a result, it is currently hard to compare the results obtained by extraction approaches in an objective manner, and the lack of annotated texts also prevents the application of the data-driven information extraction methodologies typical of the natural language processing field. To bridge this gap, we present the PET dataset, a first corpus of business process descriptions annotated with activities, gateways, actors, and flow information.
NLP4PBM: A Systematic Review on Process Extraction using Natural Language Processing with Rule-based, Machine and Deep Learning Methods
Van Woensel, William, Motie, Soroor
This literature review studies the field of automated process extraction, i.e., transforming textual descriptions into structured processes using Natural Language Processing (NLP). We found that Machine Learning (ML) / Deep Learning (DL) methods are being increasingly used for the NLP component. In some cases, they were chosen for their suitability towards process extraction, and results show that they can outperform classic rule-based methods. We also found a paucity of gold-standard, scalable annotated datasets, which currently hinders objective evaluations as well as the training or fine-tuning of ML / DL methods. Finally, we discuss preliminary work on the application of LLMs for automated process extraction, as well as promising developments in this field.
Process Extraction from Texts via Multi-Task Architecture
Qian, Chen, Wen, Lijie, Long, MingSheng, Li, Yanwei, Kumar, Akhil, Wang, Jianmin
Process extraction, a recently emerged interdisciplinary field, aims to extract procedural knowledge expressed in texts. Previous process extractors depend heavily on domain-specific linguistic knowledge and thus suffer from poor quality and a lack of adaptability. In this paper, we propose a model based on a multi-task architecture to perform process extraction. This is the first attempt to bring deep learning into process extraction. Specifically, we divide process extraction into three complete and independent subtasks: sentence classification, sentence semantic recognition, and semantic role labeling. All of these subtasks are trained jointly using a weight-sharing multi-task learning (MTL) framework. Moreover, instead of using fixed-size filters, we use multiscale convolutions to capture more local contextual features. Finally, we propose a recurrent construction algorithm to create a graphical representation from the extracted procedural knowledge. Experimental results demonstrate that our approach can extract more accurate procedural information than state-of-the-art baselines.
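The weight-sharing idea in the multi-task abstract above can be illustrated with a small forward-pass sketch. This is not the authors' model: the embedding size, filter widths, filter counts, class counts, and the names `SharedEncoder` and `TaskHead` are all hypothetical, the weights are random and untrained, and every subtask is simplified to a sentence-level classifier. It only shows one encoder with convolution filters of several widths feeding three task-specific heads.

```python
import random

random.seed(0)  # fixed seed so the toy weights are reproducible

EMB_DIM = 4            # hypothetical token embedding size
WIDTHS = (1, 2, 3)     # hypothetical filter widths (the "multiscale" part)
FILTERS_PER_WIDTH = 2  # hypothetical number of filters per width


def rand_matrix(rows, cols):
    """Return a rows x cols matrix of random weights in [-1, 1]."""
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


class SharedEncoder:
    """Multiscale 1-D convolution followed by max-over-time pooling."""

    def __init__(self):
        # One bank of filters per width; a filter spans width * EMB_DIM inputs.
        self.banks = {w: rand_matrix(FILTERS_PER_WIDTH, w * EMB_DIM) for w in WIDTHS}

    def encode(self, embeddings):
        features = []
        for width, bank in self.banks.items():
            # Flatten every window of `width` consecutive token vectors.
            windows = [
                [x for vec in embeddings[i:i + width] for x in vec]
                for i in range(len(embeddings) - width + 1)
            ]
            for filt in bank:
                # Dot product plus ReLU at each window position.
                activations = [
                    max(0.0, sum(w * x for w, x in zip(filt, win)))
                    for win in windows
                ]
                # Max pooling; 0.0 if the sentence is shorter than the filter.
                features.append(max(activations, default=0.0))
        return features


class TaskHead:
    """Task-specific linear classifier on top of the shared features."""

    def __init__(self, in_dim, n_classes):
        self.weights = rand_matrix(n_classes, in_dim)

    def predict(self, features):
        scores = [sum(w * f for w, f in zip(row, features)) for row in self.weights]
        return scores.index(max(scores))


encoder = SharedEncoder()  # a single encoder instance shared by all tasks
feat_dim = len(WIDTHS) * FILTERS_PER_WIDTH
heads = {
    "sentence_classification": TaskHead(feat_dim, 2),
    "sentence_semantic_recognition": TaskHead(feat_dim, 3),
    "semantic_role_labeling": TaskHead(feat_dim, 4),
}

sentence = rand_matrix(5, EMB_DIM)  # five tokens with toy embeddings
shared = encoder.encode(sentence)   # computed once, reused by every head
predictions = {task: head.predict(shared) for task, head in heads.items()}
```

In a trained version of such a setup, the losses of all heads would be backpropagated into the same encoder weights, which is what makes the subtasks jointly trained rather than three separate models.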